A tutorial on using LLMs for text classification, addressing common challenges and offering practical tips to improve accuracy and usability.
This article explains BERT, a language model designed to understand text rather than generate it. It discusses the transformer architecture BERT is based on and provides a step-by-step guide to building and training a BERT model for sentiment analysis.
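Since the summary above mentions the transformer architecture BERT is built on, here is a minimal sketch of scaled dot-product self-attention, the core operation a transformer encoder stacks. This is a single-head, pure-Python illustration with no learned projection matrices (the tutorial itself would use a full framework); the toy embeddings are invented for demonstration.

```python
# Scaled dot-product self-attention, single head, no learned projections.
# Each output row is a weighted average of the value rows, with weights
# given by softmax(q . k / sqrt(d)).
import math

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attention(queries, keys, values):
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

x = [[1.0, 0.0], [0.0, 1.0]]  # two toy token embeddings
print(attention(x, x, x))
```

Each token attends most strongly to itself here, so the output rows stay closest to their own input row while mixing in the other token.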
This repository showcases various advanced techniques for Retrieval-Augmented Generation (RAG) systems. RAG systems combine information retrieval with generative models to provide accurate and contextually rich responses.
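The retrieval-then-generation loop described above can be sketched in a few lines. This is a deliberately simplified stand-in: retrieval is term overlap rather than a real vector index, and the generator is a stub where a production system would call an LLM with the retrieved passages prepended; the documents and function names are invented for illustration.

```python
# Minimal RAG sketch: rank documents by term overlap with the query,
# then hand the top passages to a (stubbed) generator.
from collections import Counter

DOCS = [
    "RAG systems combine retrieval with generation.",
    "BERT is an encoder-only transformer for understanding text.",
    "Retrieval quality strongly affects answer accuracy in RAG.",
]

def tokenize(text):
    return [t.strip(".,?").lower() for t in text.split()]

def retrieve(query, docs, k=2):
    """Score each document by shared-term count with the query
    (a stand-in for a real embedding index)."""
    q = Counter(tokenize(query))
    scored = [(sum((q & Counter(tokenize(d))).values()), d) for d in docs]
    return [d for score, d in sorted(scored, reverse=True)[:k] if score > 0]

def generate(query, context):
    """Stub generator: a real system would prompt an LLM with the context."""
    return f"Answer to {query!r} grounded in {len(context)} retrieved passage(s)."

query = "How does retrieval help RAG accuracy?"
print(generate(query, retrieve(query, DOCS)))
```

The advanced techniques the repository covers (re-ranking, query rewriting, chunking strategies) all slot into the `retrieve` step of this loop.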
A case study on measuring context relevance in retrieval-augmented generation systems using Ragas, TruLens, and DeepEval, with practical strategies for evaluating the accuracy and relevance of the retrieved context.
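To make the metric concrete, here is a simplified token-overlap version of the context-relevance idea: the fraction of retrieved context sentences that share content terms with the question. Note this heuristic is only illustrative; Ragas, TruLens, and DeepEval compute relevance with LLM judges or embeddings, and the stopword list and examples below are invented.

```python
# Simplified context-relevance score: share of retrieved-context sentences
# that have content-term overlap with the question (0.0 to 1.0).
STOPWORDS = {"the", "a", "an", "is", "of", "in", "to", "and", "how",
             "what", "does", "was"}

def content_terms(text):
    return {t.strip(".,?").lower() for t in text.split()} - STOPWORDS

def context_relevance(question, context_sentences):
    """Fraction of context sentences sharing at least one content term
    with the question; 0.0 for an empty context."""
    q = content_terms(question)
    if not context_sentences:
        return 0.0
    relevant = sum(1 for s in context_sentences if q & content_terms(s))
    return relevant / len(context_sentences)

score = context_relevance(
    "How does fine-tuning improve BERT?",
    ["Fine-tuning adapts BERT to a task.", "The weather was sunny."],
)
print(round(score, 2))  # one of the two retrieved sentences is relevant
```

A low score flags that the retriever is padding the prompt with off-topic passages, which is exactly the failure mode these evaluation frameworks are built to surface.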
This tutorial covers fine-tuning BERT for sentiment analysis with Hugging Face Transformers. Learn to prepare data, set up the environment, train and evaluate the model, and make predictions.
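The workflow stages the tutorial walks through (prepare, train, evaluate, predict) can be sketched end to end. The tutorial itself uses Hugging Face's `Trainer` with a BERT checkpoint; the stdlib stand-in below, with invented toy data, learns per-word sentiment counts instead so the stages stay runnable without downloads.

```python
# Workflow sketch mirroring the tutorial's stages:
# prepare data -> train -> evaluate -> predict.
from collections import defaultdict

def prepare(data, split=0.75):
    n = int(len(data) * split)
    return data[:n], data[n:]  # train / eval split

def train(examples):
    counts = defaultdict(lambda: [0, 0])  # word -> [negative, positive]
    for text, label in examples:
        for word in text.lower().split():
            counts[word][label] += 1
    return counts

def predict(model, text):
    neg = sum(model[w][0] for w in text.lower().split() if w in model)
    pos = sum(model[w][1] for w in text.lower().split() if w in model)
    return int(pos >= neg)  # 1 = positive, 0 = negative

def evaluate(model, examples):
    return sum(predict(model, t) == y for t, y in examples) / len(examples)

data = [
    ("great movie loved it", 1), ("terrible plot hated it", 0),
    ("loved the acting", 1), ("awful and boring", 0),
    ("great fun", 1), ("boring mess", 0),
    ("loved it", 1), ("hated it", 0),
]
train_set, eval_set = prepare(data)
model = train(train_set)
print("accuracy:", evaluate(model, eval_set))
```

Swapping the word-count model for a fine-tuned BERT changes only the `train` and `predict` internals; the surrounding prepare/evaluate scaffolding is the same shape the tutorial builds.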
• A beginner's guide to understanding Hugging Face Transformers, a library that provides access to thousands of pre-trained transformer models for natural language processing, computer vision, and more.
• The guide covers the basics of Hugging Face Transformers, including what it is, how it works, and how to use it, with a simple example of running Microsoft's Phi-2 LLM in a notebook.
• The guide is designed for non-technical individuals who want to understand open-source machine learning without prior knowledge of Python or machine learning.
Quivr is an open-source RAG framework and a robust AI assistant that helps you manage and interact with information, reducing the burden of information overload. It integrates with all your files and programs, making it easy to find and analyze your data in one place.
The paper proposes TnT-LLM, a two-phase framework that automates end-to-end label generation and assignment for text mining with large language models. In the first phase, LLMs iteratively produce and refine a label taxonomy using zero-shot, multi-stage reasoning; in the second, they act as data labelers, yielding training samples for lightweight supervised classifiers. Applied to analyzing user intent and conversational domain for Bing Copilot, the framework achieves accurate, relevant label taxonomies and a favorable balance between accuracy and efficiency for classification at scale.
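The two-phase control flow can be sketched with a stubbed "LLM". Everything below is invented for illustration (the keyword lookup stands in for LLM calls, and the word-vote classifier stands in for the paper's lightweight supervised classifiers); only the phase structure follows the framework described above.

```python
# Sketch of the two-phase flow: (1) iteratively build a label taxonomy from
# the corpus, (2) use LLM-assigned labels as training data for a lightweight
# classifier. The keyword table stubs out both LLM calls.
TOPIC_KEYWORDS = {"refund": "billing", "invoice": "billing",
                  "password": "account", "login": "account"}

def llm_propose_labels(batch, taxonomy):
    """Phase 1 stub: a real LLM would read the batch and refine the taxonomy."""
    for text in batch:
        for word in text.lower().split():
            if word in TOPIC_KEYWORDS:
                taxonomy.add(TOPIC_KEYWORDS[word])
    return taxonomy

def llm_label(text, taxonomy):
    """Phase 2 stub: the LLM-as-labeler step."""
    for word in text.lower().split():
        if word in TOPIC_KEYWORDS and TOPIC_KEYWORDS[word] in taxonomy:
            return TOPIC_KEYWORDS[word]
    return "other"

def train_lightweight_classifier(labeled):
    """Cheap classifier learned from the LLM's labels: word -> label votes."""
    votes = {}
    for text, label in labeled:
        for word in text.lower().split():
            votes.setdefault(word, {}).setdefault(label, 0)
            votes[word][label] += 1
    def classify(text):
        tally = {}
        for word in text.lower().split():
            for label, n in votes.get(word, {}).items():
                tally[label] = tally.get(label, 0) + n
        return max(tally, key=tally.get) if tally else "other"
    return classify

corpus = ["I want a refund", "Cannot login to my account", "Send the invoice"]
taxonomy = llm_propose_labels(corpus, set())                # phase 1
labeled = [(t, llm_label(t, taxonomy)) for t in corpus]     # phase 2
classify = train_lightweight_classifier(labeled)
print(classify("refund for my invoice"))
```

The point of the design, as the summary notes, is that the expensive LLM runs only once per taxonomy refinement and once per training sample, while the cheap classifier handles classification at scale.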